Poster: samou812 Date: Mar 5, 2019 12:59pm
Forum: 78rpm Subject: 78 rpm upload guidelines *UPDATE*

Well, I don't have much hope of getting an answer to my update of this question, since there were no replies to the original (is there actually anyone out there?) but here goes, anyway.
I am cataloging my collection by major label; minor label (e.g., "Victor" "Black"); catalog number; title; and major artist. Each "side" is an individual entry. I'm doing this one container at a time, since my containers are stacked. I've finished cataloging and researching the first, which contained 180 "sides" (all 10" records). The object of my exercise is to identify which sides already have a digital version uploaded to IA. I had fairly good luck searching on the catalog number directly, and found that about 80% of my sides were already present on IA. However, for some of the sides I didn't find by catalog number, it appeared that an identical performance was present on a different record under a different number, usually paired with a different "other" side. I guess the operative term today would be "re-release." From my point of view, I don't have any real desire to upload those re-releases. Does anyone have any ideas about ways to identify those duplicate performances beyond the obvious matches of song titles and performers?
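
As a baseline for the "obvious" title-and-performer match, one option is to normalize both fields and group sides that agree on them but carry different catalog numbers. The following is only a minimal sketch: the dictionary keys (`label`, `catalog`, `title`, `artist`) and the sample rows are invented for illustration and are not taken from the actual catalog.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, replace runs of non-alphanumeric characters with one space, trim."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return cleaned.strip()


def possible_rereleases(sides):
    """Return groups of sides that share a normalized (title, artist) key
    but appear under more than one (label, catalog number)."""
    groups = defaultdict(set)
    for side in sides:
        key = (normalize(side["title"]), normalize(side["artist"]))
        groups[key].add((side["label"], side["catalog"]))
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


# Invented sample rows (hypothetical labels, numbers, titles, and artists).
sample_sides = [
    {"label": "Label A", "catalog": "1001", "title": "Example Song", "artist": "Some Band"},
    {"label": "Label A", "catalog": "2002", "title": "Example song!", "artist": "SOME BAND"},
    {"label": "Label B", "catalog": "3003", "title": "Other Tune", "artist": "Another Artist"},
]

candidates = possible_rereleases(sample_sides)
```

A group produced this way is only a candidate: the same artist may have recorded a title more than once, so a match on title and performer would still need to be confirmed by listening or by comparing other markings on the disc.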

Also, I have a smaller number of non-rock 'n' roll 45s from the advent of that format. Is there any archive collection of those records at IA similar to the collections of 78s?

************************* ORIGINAL POST **************************
Hello. I have a collection of several hundred 78s inherited from my grandmother and great aunt many years ago. I have the equipment and software needed to scan labels and digitize the audio, and I finally have some time to begin this project. I do have two questions: one general, one specific. I haven't yet found any documentation on-site with guidelines for the digitizing and uploading process (including what cataloging information or metadata to include, and in what format). I'm sure there must be, at minimum, a list of best practices somewhere. Can someone supply me with a link? Also, while I am absolutely willing to upload my results, I see no sense in doing so if a good quality copy of the identical record already exists at Can someone tell me the most efficient way to search for existing copies?

This post was modified by samou812 on 2019-03-05 20:40:02


Reply

Poster: brewster Date: Mar 5, 2019 1:52pm
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

"Duplicate" is, unfortunately, a fuzzy matter. To deduplicate items donated to the Internet Archive, we match on label and catalog number, then look at the titles/performers.

So we will have the same matrix multiple times on different labels.

But it sounds like you are doing the right thing.

BTW, if you donate the items to the Internet Archive we will preserve them and digitize them.
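
The two-step check described in this reply (label and catalog number first, then titles/performers) could be sketched roughly as below. This is not the Internet Archive's actual tooling; the field names, the status strings, and the sample records are all hypothetical.

```python
def catalog_key(item):
    """First-stage key: label plus catalog number, case-insensitive."""
    return (item["label"].strip().lower(), item["catalog"].strip().lower())


def content_key(item):
    """Second-stage key: title plus performer, case-insensitive."""
    return (item["title"].strip().lower(), item["performer"].strip().lower())


def build_index(known_items):
    """Map each (label, catalog) key to the set of (title, performer) pairs seen under it."""
    index = {}
    for item in known_items:
        index.setdefault(catalog_key(item), set()).add(content_key(item))
    return index


def classify(item, index):
    """Stage 1: look up label and catalog number. Stage 2: compare title/performer."""
    existing = index.get(catalog_key(item))
    if existing is None:
        return "not found"
    if content_key(item) in existing:
        return "likely duplicate"
    return "needs review"  # same label and number, but the content differs


# Hypothetical records standing in for an existing collection.
known = [
    {"label": "Label A", "catalog": "1001", "title": "Example Song", "performer": "Some Band"},
]
index = build_index(known)

donated = {"label": "label a", "catalog": "1001", "title": "Example Song", "performer": "Some Band"}
status = classify(donated, index)
```

Because the first stage keys on label and catalog number, the same recording issued on a different label would come back as "not found" here, which is consistent with the same matrix appearing multiple times on different labels.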

Reply

Poster: samou812 Date: Mar 5, 2019 4:40pm
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

Thanks. Donation is an option I will definitely keep in mind, after I finish cataloging and eliminating what you already have in your collection. It will largely depend on how many records I would need to ship, since shipping appears to be a non-trivial consideration for 78s. I do have equipment here that, while I'm certain it doesn't rival what you have at your disposal, should allow me to do an adequate or better job of digitization if I decide there are too many records to ship. I would certainly upload the digitized results. I found 80% of my first sample in your collection; after de-duplication, that will probably be closer to 90%. I have about 500 78 RPM records, so if that holds, there would be about 50 records for potential donation. That would seem to be a manageable number to ship.
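
The estimate above is simple arithmetic, sketched here for reference. The function name is made up for illustration, and the calculation assumes that the share measured on sides in the first container carries over to whole records across the collection.

```python
def records_left_to_ship(total_records, share_already_held):
    """Estimate how many records are not yet in the collection."""
    return round(total_records * (1 - share_already_held))


before_dedup = records_left_to_ship(500, 0.80)  # 100 records at the 80% match rate
after_dedup = records_left_to_ship(500, 0.90)   # 50 records at the hoped-for 90% rate
```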

Reply

Poster: brewster Date: Mar 6, 2019 10:35am
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

We would appreciate the donation (and the pre-deduplication). It helps.

We are also seeing a high percentage of duplicates.

Reply

Poster: BNRToast Date: Mar 6, 2019 2:10pm
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

Does a list or spreadsheet exist that can be used to identify duplications?

Reply

Poster: brewster Date: Mar 6, 2019 4:58pm
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

Yes: this has made certain tasks easy for me

Reply

Poster: samou812 Date: Mar 7, 2019 6:39am
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

I downloaded it as MS Excel, and I'm not certain exactly what I'm looking at. Perhaps something got lost in the conversion from native.

I mentioned that in the first case of 78s I went through, I found that 80% of the 180 catalog numbers were already uploaded to IA and digitized, with a high probability that many of the remaining items had identical performances present under a different catalog number. Those were all naked 10" discs that appeared to have been marketed individually.

Yesterday I went through my second container, which consisted mostly of 12" records, mostly in album form, and the results were quite different. I didn't find *any* of the catalog numbers, and I saw little evidence that the performances are present under different identification. Two examples: a 2-disc album titled "The Birds (Gli Uccelli)" on Victor Red (cat. 11-8945 & 11-8946) performed by the Chicago Symphony Orchestra directed by Desire Defauw; and a 3-disc album on Sonora Red (cat. 19004; 19005; 19006) of Tschaikovsky's "Nutcracker Suite" performed by the Havana Philharmonic directed by Massimo Freccia.

I should probably note that my intention is to make an initial pass through the entire collection (6 containers with roughly 80 to 90 records each) and discard any records that have positive matches already on IA. Then I will make a second, more diligent pass to eliminate identical performances under different catalog numbers. When I reach that point, if I upload scanned label images of the "survivors," is there anyone who can make further analysis of whether or not the performances exist on IA, or will my analysis already be "state of the art" at that point?

Also, my research suggests I may have some labels and album jackets that are missing (or at least are not shown in the IA collection) for performances that are already present. Would there be any value in my furnishing images of those items without the audio? (I have a large format scanner at my disposal.) If so, what would be the best way to submit those images?

Sorry for the long post; once I start writing on something that interests me, the ideas sort of leap out of my head... Thanks.

This post was modified by samou812 on 2019-03-07 14:31:18


Reply

Poster: brewster Date: Mar 7, 2019 6:43am
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

What might explain this is that our physical collecting criterion looks for all discs we do not have, but our current digitization funding de-prioritizes classical albums and 10" records.

Since we do not catalog the discs that have not been digitized, we do not yet have a way to know which classical discs we have.

Reply

Poster: samou812 Date: Mar 13, 2019 12:37pm
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

OK. I have finished cataloging and de-duplication research for these records. Total count was 398 (physical records). Count of unique physical records not currently appearing on is 125. Nearly all of those are 12". About half are in intact albums; the remainder are "loose" (I have slip covers for 10" records, but not 12"). I'm ready to at least explore the possibility of donating the lot of 125. I do have more questions, pursuant to that:
1. Is there any way for someone at to look over my list (CSV attached) and at least do a sanity check against records that you have in your possession but have not yet scanned and listed, to possibly further reduce the number?
2. Most of the records appear to be in good condition. There are a couple that are warped, actually more "dished" than "warped" (symmetric.) A few have significant dust. A couple have some vestiges of what could be mold or mildew. Are there any steps that I need to take to clean or flatten records before shipment?
3. What is your advice on shipping?
4. What other details need to be taken care of for this donation?


This post was modified by samou812 on 2019-03-13 19:37:31

Attachment: CLARA_and_AGNES_78_CATALOG_NOT_ON_IA.csv

Reply

Poster: BNRToast Date: Mar 7, 2019 3:08am
Forum: 78rpm Subject: Re: 78 rpm upload guidelines *UPDATE*

Very useful, thanks. I use a similar sheet for our Bob and Ray materials.
Perhaps a version could be placed in the 78 RPM Archive collection and updated occasionally.

And thanks again for the Internet Archive.